Abstract Diverse animal species use multimodal
communication signals to coordinate reproductive 
behavior. Despite active research in this field, the brain 
mechanisms underlying multimodal communication 
remain poorly understood. Similar to humans and many 
mammalian species, anurans often produce auditory 
signals accompanied by conspicuous visual cues (e.g., vocal
sac inflation). In this study, we used video playbacks to 
determine the role of vocal-sac inflation in little torrent 
frogs (Amolops torrentis). Then we exposed females to blank, 
visual, auditory, and audiovisual stimuli and analyzed 
whole brain tissue gene expression changes using RNA-seq.
The results showed that both auditory cues (i.e., male
advertisement calls) and visual cues were attractive to 
female frogs, although auditory cues were more attractive 
than visual cues. Females preferred simultaneous bimodal 
cues to unimodal cues. The hierarchical clustering of 
differentially expressed genes showed a close relationship 
between neurogenomic states and momentarily expressed 
sexual signals. We also found that the Gene Ontology terms 
and KEGG pathways involved in energy metabolism were 
mostly increased in blank contrast versus visual, acoustic, 
or audiovisual stimuli, indicating that brain energy use 
may play an important role in response to these stimuli. In 
sum, behavioral and neurogenomic responses to acoustic 
and visual cues are correlated in female little torrent frogs.

1. Introduction


Studies elucidating the mechanisms of social behavior, such as 
mate choice and resource competition, are of key importance 
in ecology and evolutionary biology (Toth et al., 2010). While 
progress has already been made, the development of molecular 
techniques promises to provide unprecedented opportunities to 
determine how behavioral patterns and processes are governed 
(Alvarez et al., 2015). Increasingly, studies have used sophisticated 
methods to explore the regulation of specific phenotypes using 
genome-wide approaches. Although the genome has been 
viewed in the past as a passive agent in controlling adult brain 
function (Dong et al., 2009), widespread measurements of gene 
expression in different experimental systems have clearly 
revealed that behavioral activity, perceptual experience, and 
changing social conditions can result in rapid gene expression 
changes in the brain (Clayton, 2000; Robinson et al., 2008). In
certain environments, animal behavior may evolve through 
changes in specific gene regulation in the brain (Bell and 
Robinson, 2011), yet we still know little about the relationship 
between brain gene expression and social behavior (Dong et al.,
2009; Zayed and Robinson, 2012). 

Multimodal communication has received widespread 
attention in the study of animal behavior. Although many 
animals seem to communicate primarily with signals in a 
single modality (Ryan et al., 2018), an increasing number of
studies have indicated that multimodal communication is more
ubiquitous (Partan and Marler, 1999; Hebets and Papaj, 2004; 
Partan, 2013; Starnberger et al., 2014b). A well-known human 
example is the McGurk effect, in which visual cues associated 
with the facial gestures involved in speech production have 
a profound impact on speech perception (McGurk and
MacDonald, 1976; Driver, 1996). The perception of stimuli
across sensory modalities can improve selective attention, 
signal detection, learning, and memory in humans as well as 
other animal groups (Bahrick et al., 2004; Halfwerk et al., 2019). 
Though it has been the subject of much research, determining 
how the brain integrates signals derived from multiple sensory 
modalities remains challenging. 

Dynamic genome analysis, which is based on new gene 
expression and sequencing methods, has provided an excellent 
opportunity to uncover potential mechanisms involving 
multimodal communication behavior (Partan, 2013). These 
methods have already been applied to the study of the genetic 
basis of acoustic communication in songbirds (Lovell et al., 2008;
Balakrishnan et al., 2012; Balakrishnan et al., 2013; Balakrishnan
et al., 2014; Frankl-Vilches et al., 2015), although, as yet, the
regulatory architecture of the neurogenomic states regulating 
complex behaviors is not well understood. Several single-gene 
studies, however, provide a foundation for using genomic 
techniques to address questions about how the brain processes 
multimodal signaling (Partan, 2013). 

Frogs are excellent model systems for the experimental 
investigation of multimodal communication (Starnberger et
al., 2014b; Bee, 2015; Stange et al., 2017). Anuran acoustic signals
can be readily synthesized and, in some species, male sexual 
displays incorporate visual cues that can be used as stimuli 
in playback experiments (Taylor et al., 2008; Starnberger et al., 
2014a). Notably, in most anuran species, male vocalizations are
accompanied by synchronous inflation of the vocal sac. Vocal 
sac inflation may act as a secondary cue as opposed to a signal 
or a signal component. Although the evolved function of the 
vocal sac is to cycle air during calling (Pauly et al., 2006), many
studies have indicated that its role in mating is to facilitate
detection and localization through movement and coloration
(Rosenthal et al., 2004; Taylor et al., 2008; Preininger et al., 2013a;
Taylor and Ryan, 2013). Thus, vocal sac visual cues could 
act on both female mate choice and male-male interactions 
(Starnberger et al., 2014b). For instance, in Kottigehar dancing 
frogs (Micrixalus kottigeharensis), a pulsating vocal sac induces 
more agonistic behaviors than unimodal acoustic stimuli 
(Preininger et al., 2013b).

In this study, we focused on the little torrent frog (Amolops 
torrentis), a species endemic to Hainan Island that lives along
mountain streams and calls during the day and at night during 
the breeding season. Males of this species prefer to call from 
stones with the same background color as the frog's body, 
which differs distinctively from the white color of the vocal 
sac. Thus, it is likely that visual cues associated with vocal sac 
inflation could serve as visible signals capable of increasing 
communication effectiveness in little torrent frogs in a noisy 
stream environment. In this study, we first employed video
playbacks in behavioral experiments in order to compare the
sexual attractiveness of visual cues associated with male vocal 
sac movement with or without accompanying acoustic call 
stimuli. Then we presented females with blank, visual, acoustic, 
or audiovisual stimuli and subsequently collected brain tissues 
to obtain whole transcriptomes using RNA-seq. For the first 
time, we assessed whether neurogenomic states correlate 
with audio-visual behavior in order to identify potential 
molecular response mechanisms. In view of the limited research 
background and gene characterization in little torrent frogs, 
we focused on statistical analyses designed to identify broad 
functional expression patterns related to groups of genes and 
on the characterization of whole-genome expression. 


2. Materials and Methods 


2.1. Signal design Male little torrent frogs produce acoustic 
signals during both day and night. The time of day does not 
influence signal properties. During the breeding season, we 
synchronously recorded videos and sounds during daylight 
hours using a Nikon camera (D800) fixed on a tripod, 
connected to a directional microphone (Sennheiser ME66 with 
K6 power module) at the Mt. Diaoluo Nature Reserve (18.44°N 
and 109.52°E), Hainan Province, China. We chose a male calling 
at the site with uniform illumination, not in direct sunlight, 
with no nearby calling conspecific individuals. We obtained a 
video recording of the calling male and a video recording of 
the calling environment with the frog excluded. We used the 
two videos to create three base stimuli that were subsequently 
edited in Adobe Premiere Pro CS6. The base stimuli were (1) an 
18 s video with a calling male in which the vocal sac inflation 
and the acoustics were both present, (2) a 9 s video with a male 
not calling and no vocal sac inflation, and (3) an 18 s video with 
a blank screen on which the frog and the acoustics were both 
absent. The videos with different durations were designed to
ensure that all stimulus pairs had equal intervals (see below). 

The audio files were edited in Adobe Audition 3.0 after 
being separated from the video tracks, and subsequently 
resynchronized using Adobe Premiere Pro CS6 in order to 
create five stimuli for the present study. These stimuli were (1) 
a 9 s stimulus with a silent frog; (2) a 9 s stimulus with vocal 
sac inflation but no sound; (3) an 18 s stimulus with vocal sac 
inflation but no sound; (4) an 18 s stimulus with a call but 
no vocal sac inflation; and (5) an 18 s stimulus with vocal sac 
inflation accompanied by a call. Analysis of the time-frequency 
domain characteristics showed that all call parameters fell 
within the natural range (Zhao et al., 2017b). 

We conducted four two-choice tests with the above- 
described stimuli. The first experiment involved a 9 s stimulus 
with vocal sac inflation versus a 9 s stimulus with a silent frog,
in order to assess whether vocal sac inflation acts as a visual
signal. The second experiment involved an 18 s stimulus with 
a call versus an 18 s stimulus with vocal sac inflation, in order 
to compare the relative attractiveness of acoustic and visual 
signals. The third and fourth experiments involved an 18 s 
stimulus with both vocal sac inflation and the accompanying 
call versus an 18 s stimulus with only a call and versus an 18 
s stimulus with only vocal sac inflation, respectively. The two 
experiments were used to determine whether acoustic and 
visual cues jointly enhance the attractiveness of the stimulus in 
the other modality. The interval between two stimuli was set 
at 9 s for all stimulus pairs. Consequently, the first experiment 
had a different time unit than the other three experiments (9 s 
and 18 s, respectively).


2.2. Playback experiments We performed the behavioral
experiments at field research bases in the Mt. Diaoluo Nature 
Reserve. Amplexed male and female frogs are generally found 
in rock crevices or holes in the stream. In this study, gravid 
females were collected (between 2000 and 2200 h) from the
stream and nearby shrubs near the laboratory. Prior to the
experiment, females were placed in darkness for at least one 
hour to allow their eyes to adapt to the dark experimental 
conditions (Stange et al., 2017). After testing, we measured their 
snout-vent length and body mass and returned them to the 
stream on the same day they were collected. 

We conducted playback experiments in a sound-attenuating
phonotaxis chamber. Females were tested in a corridor (1.3
m x 15 m) constructed with foam walls. At each side near
the corner, an LCD monitor (Philips 17S4LSB) and a speaker
(JBLCLIP + BLK, JBL) were coupled to broadcast video and
sound, respectively (Figure 1). For each monitor, an area (14 cm
x 11 cm) at the bottom left or bottom right was used to present
the video stimulus, thereby assuring that the apparent body
size of the video frog was equivalent to that of a live frog. Each
speaker was fixed alongside the playback area of a coupled
monitor, thereby making the distance between the center of
the speaker and the male frog in the video equal to 12 cm. A
previous anuran study showed that a male frog and a speaker
are not perceived as separate objects if they are located within
12 cm of each other (Narins et al., 2005). In front of the screens,
we marked a position 1 m from the two video playback areas
as the initial female placement point for each playback trial. The
distance between the two video playback areas also was 1 m,
resulting in a 60° angle between monitors with respect to the
marked position. This allowed females to easily see the vocal sac
and body of the male on both screens (Taylor et al., 2008; Taylor
et al., 2017). We observed female behavior on a monitor using a
video system with an infrared light source.

Figure 1 Schematic of the acoustic and visual playback arena. The blank rectangle (14 cm x 11 cm) on each screen represents the area
used to present video stimuli during the two-choice tests. The 12 cm is the distance between the center of the speaker and the male frog in
the video. The image of the frog represents the initial female placement point for each playback trial.
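
The 60° angle reported for the arena follows from the law of cosines applied to the three 1 m distances. The short Python sketch below reproduces that arithmetic; it assumes the three distances are straight-line distances in one plane, and the function name `subtended_angle_deg` is a hypothetical helper, not something taken from the study.

```python
import math


def subtended_angle_deg(dist_a, dist_b, separation):
    """Angle (degrees) at the observer between two targets, via the law of cosines.

    dist_a, dist_b: distances from the observer to each target.
    separation: distance between the two targets.
    """
    cos_theta = (dist_a**2 + dist_b**2 - separation**2) / (2 * dist_a * dist_b)
    return math.degrees(math.acos(cos_theta))


# Assumed geometry: release point 1 m from each playback area, areas 1 m apart.
angle = subtended_angle_deg(1.0, 1.0, 1.0)
print(f"{angle:.1f} degrees")
```

With all three sides equal the triangle is equilateral, so the angle at the release point is 60°.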

For each female, calls were played at 75 dB SPL (1 m from 
the speaker), which is near the auditory threshold of the call 
frequency range (Zhao et al., 2017a), and videos were adjusted 
to a dim condition (1 lux on the screen; measured by TES 
1399 Light Meter Pro), which was approximately equal to 
the crepuscular light level at the stream in the frog's natural 
environment, in order to best simulate conditions in which 
visual and acoustic integration might normally occur (Rowe, 
1999; McDonald et al., 2000). Prior to each trial, we used a 
light-tight box to restrain the females at the marked position.
We elevated the box and freed them so that they could 
choose between alternative stimulus pairs while the stimulus 
pairs were broadcast antiphonally from each side. A choice 
was recorded if the female approached a speaker-monitor 
combination within 5 cm. We only scored females who were 
responsive in each of the four experiments. We considered a 
female as lacking motivation if she failed to make a choice 
within 10 min. For each frog, we presented the stimulus
pairs in random order to avoid potential side or order effects.
None of the females were re-used for multiple experiments. 
Female choice data were analyzed with the two-tailed exact 
binomial test using R version 3.2.5. 
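
The choice data were analyzed in R, but the test itself is simple enough to sketch in a few lines. The Python example below is a minimal illustration, assuming a null hypothesis of no preference (probability 0.5 for each alternative); the function name `exact_binomial_two_tailed` is hypothetical.

```python
from math import comb


def exact_binomial_two_tailed(successes, trials):
    """Two-tailed exact binomial P-value against a null of p = 0.5."""
    # Under p = 0.5 the distribution is symmetric, so the two-tailed
    # P-value is twice the tail at or beyond the more extreme count.
    extreme = max(successes, trials - successes)
    upper_tail = sum(comb(trials, k) for k in range(extreme, trials + 1))
    return min(1.0, 2 * upper_tail / 2**trials)


# Example: 12 of 15 females choose one alternative.
p_value = exact_binomial_two_tailed(12, 15)
print(round(p_value, 3))  # 0.035
```

A 12/3 split among 15 responsive females therefore gives P = 0.035 under these assumptions.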


2.3. Brain sample collection Females were collected on the
morning of the day (between 1000 and 1200 h) that they were 
tested. These frogs were not the same set of individuals used 
in the playback experiments. Prior to the test, each female was 
isolated for at least eight hours in a dark soundproof chamber 
in order to bring about a decline in mRNA expression that 
might have been induced at the site of the breeding choruses 
(Burmeister et al., 2008). For each frog, a speaker and a video 
screen were coupled to present one stimulus consisting of 
either a blank contrast, a visual signal, an acoustic signal, or 
an audiovisual signal for 30 min. All animals were randomly 
assigned to different treatment groups. The arena, sound, and 
lighting conditions were the same as described for the previous 
playback experiments. During the playbacks, females were 
confined to a cylindrical cage (diameter = 7 cm, height = 10 cm) 
fixed 1 m from the male in the video. Each frog was alone in 
the cage, and frogs in the cage were able to hear the calls and 
see the visual display stimuli. After the playbacks, the speaker
and video screen were immediately turned off, and the frogs
remained in the dark for 30 min. Previous studies have shown 
that immediate early gene (IEG) mRNA accumulation in the 
frog brain reaches the highest level after 30 min following 
exposure to a continuous stimulus for 30 min (Burmeister et 
al., 2008). Ambient temperatures were maintained at 23-25°C
during the experiment. 

In order to avoid bias caused by differing female activity 
levels, we only sampled individuals that did not move around 
but were motivated and faced the monitor during the entire 
experiment. After the experiments, the females were euthanized 
with an overdose of MS-222 solution (0.3%), and whole brain 
tissue was quickly collected. In addition to the four groups 
presented with different stimuli (n = 3 samples per group), we 
collected samples from frogs with dark treatment alone as 
a further control (n = 3 samples). The dissection implements 
were treated with Surface RNase Erasol (BioTake Corporation, 
China) according to the manufacturer's instructions, and all 
operations were conducted on ice. Samples were preserved in 
RNAlater (Sigma-Aldrich) and stored at −20°C.


2.4. RNA extraction, sequencing, and de novo assembly 
Total RNA was isolated using TRIzol® reagent (Invitrogen,
CA, USA) according to the manufacturer's protocols. RNA
degradation and contamination were analyzed using 1%
agarose gels. RNA purity, RNA concentration, and RNA
integrity (RIN scores ranged from 9.0 to 9.6) were tested using
the NanoPhotometer® spectrophotometer (IMPLEN, CA,
USA), Qubit® RNA Assay Kit in Qubit® 2.0 Fluorometer (Life
Technologies, CA, USA), and RNA Nano 6000 Assay Kit of
the Agilent Bioanalyzer 2100 system (Agilent Technologies, CA,
USA), respectively. The cDNA libraries were constructed using
the NEBNext® Ultra™ RNA Library Prep Kit for Illumina®
(NEB, USA) according to the manufacturer's instructions. All
libraries were sequenced on an Illumina HiSeq X Ten platform
(San Diego, CA, USA). Prior to assembly, we obtained clean 
reads by removing low-quality raw reads, reads with an 
adapter, and reads with poly-N, and calculated their Q20, 
GC content, and the number of sequences. The clean data 
and expression profiling data were deposited in NCBI’s Gene 
Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/ 
query/acc.cgi?acc=GSE122947). We completed transcriptome 
assembly based on clean reads using Trinity software with 
min_kmer_cov set to two by default (Grabherr et al., 2011). 
Clean reads from all samples were used to build the reference 
transcriptome in order to avoid biasing results toward different 
samples. 


2.5. Gene function annotation and differential expression 
analyses To obtain comprehensive function information, 
we annotated unigenes against seven databases including the
Swiss-Prot, Protein family (Pfam), NCBI non-redundant protein
sequences (NCBI-NR), NCBI nucleotide sequences (NCBI-NT), 
Gene Ontology (GO), euKaryotic Ortholog Groups (KOG), 
and the Kyoto Encyclopedia of Genes and Genomes (KEGG) 
databases. The Swiss-Prot and NR annotations were performed 
using diamond (v0.8.22) with an e-value < 10? and the KOG 
annotation with an e-value < 107. The Pfam, NT, GO, and the 
KEGG annotations were performed using hmmscan (HMMER 
3.0) with an e-value < 0.01, NCBI blast (v2.2.28+) with an e-value 
< 10°, blast2go (b2g4pipe. v2.5) with an e-value < 10°, and 
KAAS (r140224) with an e-value « 10 respectively. 

We regarded the assembled transcriptome as a reference 
and mapped clean reads of each sample to the reference 
transcriptome using RSEM software (Li and Dewey, 2011). The 
software calculated the number of read counts for each gene. 
Gene expression levels were expressed as fragments per
kilobase of transcript per million mapped reads (FPKM). The
differentially expressed genes (DEGs) evoked by different
stimuli were identified from the read count data. The analysis
was completed with the DESeq R package (1.10.1) (Anders and
Huber, 2010). The resulting P-values were adjusted using the
Benjamini and Hochberg method for controlling the false
discovery rate (Storey and Tibshirani, 2003). Genes with an
adjusted P < 0.05 were considered differentially expressed in
response to various stimuli. The number of DEGs was compared
with a two-tailed exact binomial test.
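
The two quantities named in this paragraph, FPKM and the Benjamini and Hochberg adjustment, can be illustrated with a small Python sketch. It is not the RSEM or DESeq implementation, which handle multi-mapped reads, effective transcript lengths, and count dispersion; it only shows the textbook definitions. The function names and the example numbers are hypothetical.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million).
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that adjusted
    # values never increase as the raw P-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical gene: 500 fragments, 2,000 bp transcript, 20 million mapped fragments.
example_fpkm = fpkm(500, 2000, 20_000_000)

# Hypothetical raw P-values for four genes, thresholded at adjusted P < 0.05.
example_adjusted = benjamini_hochberg([0.001, 0.02, 0.03, 0.5])
significant = [q < 0.05 for q in example_adjusted]
```

In this toy example the first three genes pass the adjusted P < 0.05 threshold and the fourth does not.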


3. Results 


3.1. Female behavioral responses to different sexual displays
In the present study, video animations evoked female responses
efficiently, and most individuals jumped on the viewing screen
directly when making choices. Our previous study indicated
that male calls (acoustic cues) were an important sexual display
in A. torrentis (Zhao et al., 2017a). Females significantly preferred
stimuli with vocal sac inflation over those without inflation
(12/3; Table 1), indicating that vocal sacs are also a visual cue.
However, females significantly preferred calls to inflated vocal
sacs (12/3; Table 1). When we presented females with vocal
sac inflation accompanied by calls (audiovisual cue) paired
with an alternative of vocal sacs only or calls only, females
showed a significant preference for the complex multisensory
components (12/3; Table 1). In sum, our results suggest that the
attractiveness hierarchy of sexual displays is as follows, from
least attractive to most attractive: blank contrast (in the absence
of acoustic and visual cues), visual cues, acoustic cues, and
audiovisual cues.

Table 1 Summary of video playback experiments for A. torrentis.


3.2. Neurogenomic and behavioral responses to different
stimuli are closely correlated Sequencing and de novo 
assembly results are included in Table S1. We analyzed DEGs 
in whole female brains to explore the potential influences of 
specific sexual displays on A. torrentis neurogenomic states. A 
total of 808 DEGs were detected from all whole brain tissues in 
response to different types of stimulation. Hierarchical clustering 
has been used to determine whether brain gene expression 
patterns track behavior in other species (Chandrasekaran et 
al., 2011). We therefore employed this method to reveal brain
transcriptional profiles of total DEGs across all video playback 
samples, and thus to determine transcriptome responses to 
different behavioral categories. Hierarchical cluster analysis 
indicated a close relationship between brain gene expression 
and behavior (Figure 2). Overall, the transcriptional profiles of 
samples from the same treatment groups were more similar to 
one another than to those from different treatment groups. With
the exception of one blank contrast sample (B1) and one acoustic
stimulus sample (A1), all samples from the same behavioral
condition gathered in a distinct cluster (Figure 2). Moreover,
the cluster of samples exposed to the
audiovisual stimulus condition lay between the visual stimulus 
cluster and the acoustic stimulus cluster. 

We compared the number of DEGs evoked by multimodal 
cues versus blank contrast and evoked by multimodal cues 
versus unimodal (acoustic or visual) cues. A total of 362 DEGs 
were obtained in the audiovisual versus blank group, while 
only 169 and 67 DEGs were identified in the audiovisual 
versus acoustic group and the audiovisual versus visual group, 
respectively (Table 2). The binomial test showed that the
number of DEGs found for audiovisual cues versus blank
contrast was significantly higher than for audiovisual cues
versus acoustic cues (P < 2.2 × 10^-16) and for audiovisual cues
versus visual cues (P < 2.2 × 10^-16). These results indicated that
the number of DEGs was related to behavioral condition
differences, and that comparisons between more similar signals
produce fewer DEGs.
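
The text does not spell out how the binomial test was applied to the DEG counts. One plausible reading, sketched below in Python, pools the two totals being compared and tests them against an equal split. That null hypothesis is an assumption, and the function name `exact_binomial_two_tailed` is hypothetical.

```python
from math import comb


def exact_binomial_two_tailed(count_a, count_b):
    """Two-tailed exact binomial P-value for two counts under an equal-split null."""
    total = count_a + count_b
    extreme = max(count_a, count_b)
    # Exact integer tail sum, doubled because the null (p = 0.5) is symmetric.
    upper_tail = sum(comb(total, k) for k in range(extreme, total + 1))
    return min(1.0, 2 * upper_tail / 2**total)


# DEG totals taken from Table 2.
p_av_blank_vs_av_acoustic = exact_binomial_two_tailed(362, 169)
p_av_blank_vs_av_visual = exact_binomial_two_tailed(362, 67)
p_av_blank_vs_v_blank = exact_binomial_two_tailed(362, 370)
```

Under this assumed setup, the first two comparisons give P-values far below 0.05, while the comparison of 362 with 370 DEGs does not approach significance.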


Test  Alternative 1     Alternative 2  Choices  P
1     Vocal sac         -              12/3     0.035
2     Vocal sac         Call           3/12     0.035
3     Call + vocal sac  Call           12/3     0.035
4     Call + vocal sac  Vocal sac      12/3     0.035
Figure 2 Hierarchical cluster analysis on global gene expression patterns reveals a close relationship between neural genomic
responses and different communication behaviors. Colors from blue (low expression) to red (high expression) represent the relative
expression quantity of DEGs. The cluster relationships represent the similarity between different samples and DEGs. Apart from A1 and
B1, the analysis shows five clusters corresponding to dark contrast, blank contrast, visual, auditory, and audiovisual stimuli. D1-D3,
samples from dark contrast; B1-B3, samples from blank contrast; V1-V3, samples from visual stimuli; A1-A3, samples from
auditory stimuli; AV1-AV3, samples from audiovisual stimuli.


Table 2 Summary of DEGs in response to different behavioral categories.

Number  Stimuli   Total DEGs  Up DEGs  Down DEGs
1       AV vs. B  362         113      249
2       AV vs. A  169         83       86
3       AV vs. V  67          28       39
4       V vs. B   370         121      249

Note: AV, audiovisual; B, blank; A, auditory; V, visual.
3.3. Differentially expressed genes in response to different
stimuli To detect potential genes involved in the response 
to cues associated with different behavioral conditions, we 
compared whole brain gene expression among the multimodal/ 
unimodal versus blank stimuli conditions and between the 
multimodal versus unimodal stimuli conditions. A total of 370 
DEGs (121 up-regulated and 249 down-regulated genes) were 
identified from the visual versus blank treatments, and a total 
of 388 DEGs (180 up-regulated and 208 down-regulated genes) 
were obtained from the acoustic versus blank treatments. These 
results were not significantly different from those associated 
with the audiovisual versus blank contrast treatments (362 
DEGs: 113 up-regulated and 249 down-regulated genes) 
(binomial test: P > 0.05; Table 2). A total of 169 DEGs (83 up-
regulated and 86 down-regulated genes) were detected from the
audiovisual versus acoustic treatments, while only 67 DEGs (28 
up-regulated and 39 down-regulated genes) were found for the 
audiovisual versus visual treatments (Table 2). 

To obtain information about the functions of the DEGs, 
we conducted GO analysis focused on three main categories 
including biological processes, molecular functions, and cellular 
components. We identified many significantly enriched GO 
terms in the blank contrast versus visual cue, acoustic cue, or 
audiovisual cue comparisons (Table S2). In these comparisons, 
the up-regulated genes involved in energy metabolism were 
found to be significantly enriched (Table S2). The most 
significantly expressed down-regulated genes were involved in 
various stimulus responses and in lipid or sterol metabolism. 
In the multimodal versus unimodal stimuli comparison, 
however, none of the significant GO terms were found to 
mediate multimodal cue integration specifically (Table S2). 
The expression and gene function of the top 10 DEGs annotated in
the GO database, ranked by adjusted P-value, were then analyzed.
We found that most significant DEGs were involved in 
energy metabolism, sterol and lipid metabolism, transcription 
and translation, and ion binding and transport in different 
comparisons (i.e., blank contrast versus visual cues/acoustic
cues/audiovisual cues) (Table 3). Specifically, all genes involved 
in energy metabolism were found to be significantly up- 
regulated in these comparisons (Table 3). 

Consistent with GO analysis, the most significantly enriched 
pathways when compared against KEGG databases were all 
related to energy generation among all DEGs associated with 
the response to visual, acoustic, or audiovisual stimuli (Figures 
3A-C). Possibly because the samples were from brain tissue, 
the next most significantly enriched pathways after energy 
metabolism were all related to neurodegenerative diseases such 
as Parkinson's disease (Figures 3A-C). In the multimodal versus
unimodal stimuli comparison, however, we obtained few 
significantly enriched pathways (Figures 3D and E).

4. Discussion


In anurans, vocal sac movement has traditionally been regarded 
as a by-product of call production. However, an increasing 
number of studies have found that stimuli associated with 
the vocal sac can provide a basis for composite signaling 
during communication (Starnberger et al., 2014a). For example, 
agonistic male interactions in both diurnal dart-poison frogs 
(Allobates femoralis) and Kottigehar dancing frogs (Micrixalus 
kottigeharensis) are only provoked by conspecific calls 
synchronized with vocal sac movement (Narins et al., 2003; 
Narins et al., 2005; Starnberger et al., 2014a). In the Krefft's river
frog (Phrynobatrachus krefftii), male-male agonistic behaviors can
be induced by the dynamic visual signal of vocal sac inflation
in addition to calls (Starnberger et al., 2014a). Nocturnal anuran
species can also communicate with visual cues associated with 
the vocal sac. For example, vocal sac inflation and coloration 
have been shown to influence female choice in a few nocturnal 
frog species (Rosenthal et al., 2004; Taylor et al., 2008; Gomez et
al., 2009; Richardson et al., 2009). Thus, the roles played by vocal
sac traits in social behavior are diverse in anuran groups. 

Many conditions can favor the evolution of multimodal 
communication systems. Stream noise is an important 
environmental factor for torrent frogs because animal 
communication sounds can be masked by high background 
noise. Studies of multimodal communication have also 
emphasized the importance of determining if individual 
signal components are redundant (i.e., conveying the same
information) or nonredundant (i.e., conveying different 
information) (Partan and Marler, 2005). In the present study, 
female little torrent frogs preferred temporally overlapping 
bimodal signals to unimodal signals, suggesting that the 
interaction between acoustic and visual cues can increase 
communication efficiency in noisy stream environments. 
However, both visual vocal sac inflation signals and 
advertisement call acoustic signals alone were sufficient for 
mate attraction. Our results therefore indicate that vocal sac 
inflation transmits at least some of the same information as 
male advertisement calls for sexual selection. In addition, the 
little torrent frog is a territorial species, and males often perform 
additional visual signals such as foot-flagging displays. More 
study is needed to reveal the extent to which vocal sac and 
other visual cues combined with call stimulation affect male- 
male competition and female mate choice in this species. 

Many investigators have used microarrays or RNA-seq
(i.e., transcriptomics) to measure brain gene expression related
to various behavioral traits. An excellent example comes 
from studies of honey bees (Apis mellifera), whose social 
role plasticity is mediated by brain gene expression over 
multiple timescales (Zayed and Robinson, 2012). In the field
of acoustic communication, neurogenomic states have been
used in songbird species to link brain gene expression with 
seasonal singing behavior (Frankl-Vilches et al., 2015) and 
song response habituation (Dong et al., 2009). Previous studies, 
however, have focused primarily on animal behaviors that 
are maintained for reasonably sustained periods. It is therefore 
unclear whether brain transcriptomics are sensitive enough to 
reflect the momentary processes associated with multimodal 
communication in real time. Although this field would seem to 
have potential, it remains to be seen if these methods can reveal 
the genetic bases of complex behaviors (Partan, 2013). 

In this study, we asked whether neural genomic
responses track female behavioral responses to complex
communication signals after controlling for experimental
conditions and animals' reproductive states. Interestingly,
hierarchical cluster analysis suggested a strong association
between neurogenomic states and the stimuli to which the
animal was exposed. Meanwhile, the number of DEGs was
consistent with the behavioral condition difference hierarchy;
that is, fewer DEGs were produced between the two least-
differentiated behavioral conditions. These results indicate that
multimodal communication behaviors may be related to brain
transcriptional profiles in anurans, which is similar to reports
on the relationship between whole brain gene expression and
complex behavior in social insects (Zayed and Robinson, 2012)
as well as túngara frogs (Engystomops pustulosus) (Hoke et al.,

Table 3 Top 10 DEGs in response to different behavioral categories.
Group Gene. id FDR Expression Gene Description 
Cluster-46846.133914 2.03 x 10” Up Sterol metabolic process//steroid biosynthetic process 
Cluster-46846.159594 2.83 x 10” Down G-protein coupled receptor activity 
Cluster-46846.133915 3.88 x 1077 Up Sterol metabolic process//steroid biosynthetic process 
Cluster-46846.152204 1.83 x 10? Down Extracellular space 

heen Cluster-46846.133918 8.18 x 10? Down Sterol metabolic process//steroid biosynthetic process 

vs. 

Cluster-46846.157580 1.31 x 10” Up Oxidation-reduction//transcription//carbohydrate metabolic 
Cluster-46846.83559 1.59 x 10” Up Protein binding//ATP binding 
Cluster-46846.160021 8.40 x 10” Down Transmembrane transport//oxidation-reduction 
Cluster-46846.164514 9.53 x 107 Down Mismatch repair//lipoprotein particle clearance//lipoprotein 
Cluster-46846.130987 1.20 x 10^ Down Ion channel activity 
Cluster-46846.152204 9.64 x 107" Down Extracellular space 
Cluster-46846.161260 1.26 x 107! Down Estrogen metabolism//exocytosis//steroid hormone metabolism 
Cluster-46846.166932 iL Ss 1" Down Phospholipase inhibitor activity//calcium ion binding 
Cluster-46846.164514 1.26 x 10” Down Mismatch repair//lipoprotein particle clearance//lipoprotein 

ETE Cluster-46846.147885 4.82 x 10? Up RNA binding//nucleic acid binding 

vs. 

Cluster-46846.155071 8.71 x 10” Down Hormone activity//pheromone binding 
Cluster-46846.83559 3.08 x 10% Up Protein binding//ATP binding 
Cluster-46846.160189 1.39 x 10” Down Phosphotransferase activity//DNA binding 
Cluster-46846.144622 6.79 x 10° Down Transposition, DNA-mediated 
Cluster-46846.207219 2.50 x 10°” Up Aerobic respiration 
Cluster-46846.152204 3.63 x 10? Down Extracellular space 
Cluster-46846.164514 6.77 x 107 Down Mismatch repair//lipoprotein particle clearance//lipoprotein 
Cluster-46846.141030 9.67 x 10°° Down GTPase activity//GTP binding//protein binding 
Cluster-46846.155071 1.92 x 10” Down Hormone activity//pheromone binding 

Mv Cluster-46846.83559 3.57 x 10? Up Protein binding//ATP binding 

vs. 
Cluster-46846.139634 3.56 x 107" Down Protein binding 
Cluster-46846.207219 2.69 x 10” Up Aerobic respiration 
Cluster-46846.136571 3.01 x 10? Down Hydrolase activity//ferric iron binding 
Cluster-46846.6262 6.78 x 107 Down Metal ion binding 

Cluster-46846.246981 1.75 x 107° Down Amino sugar metabolism//Golgi vesicle transport 


Note: AV, audiovisual; B, blank; A, auditory; V, visual. 
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2007). When animals were transferred from a dark, quiet 
environment to the playback setup, the frame and running 
water in the blank contrast (in the absence of acoustic and 
visual cues) may have provided visual or acoustic information. 
We therefore included the dark group as a further contrast in 
this study. However, animals may behave differently when simply 
sitting in the dark than in any of the other four conditions. To 
assess whether the dark group would significantly 
change the clustering result, we also analyzed the data with the 
dark group excluded. The clustering results for the 
four conditions (Figure S1) were consistent with those for the 
five conditions analyzed together (Figure 2). Thus, the 
clustering result is robust to the inclusion of the dark group. 
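
A robustness check of this kind can be sketched in a few lines. The following is a minimal illustration, not the authors' pipeline: it assumes average-linkage clustering on a Pearson-correlation distance, and the sample names and expression values are hypothetical toy data. It clusters the samples with and without the dark group and checks that the remaining groupings are unchanged.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(x, y):
    """Return 1 - Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (norm_x * norm_y)


def average_linkage(profiles, k):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    def linkage(pair):
        a, b = pair
        dists = [pearson_distance(profiles[i], profiles[j]) for i in a for j in b]
        return sum(dists) / len(dists)

    clusters = [frozenset([name]) for name in profiles]
    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return set(clusters)


# Hypothetical toy data: two samples per condition, six genes per sample.
profiles = {
    "B1": [5, 1, 1, 1, 1, 1], "B2": [5, 1, 1, 1, 1, 2],
    "V1": [1, 5, 1, 1, 1, 1], "V2": [1, 5, 1, 1, 1, 2],
    "A1": [1, 1, 5, 1, 1, 1], "A2": [1, 1, 5, 1, 1, 2],
    "AV1": [1, 1, 1, 5, 1, 1], "AV2": [1, 1, 1, 5, 1, 2],
    "D1": [1, 1, 1, 1, 5, 1], "D2": [1, 1, 1, 1, 5, 2],
}
no_dark = {name: p for name, p in profiles.items() if not name.startswith("D")}

with_dark = average_linkage(profiles, k=5)
without_dark = average_linkage(no_dark, k=4)

# Drop the dark cluster from the five-condition result and compare.
with_dark_trimmed = {c for c in with_dark if not any(s.startswith("D") for s in c)}
print(with_dark_trimmed == without_dark)
```

With these toy profiles, both runs recover the same four stimulus groups, which mirrors the agreement reported between Figure S1 and Figure 2.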

Uncovering the brain mechanisms of multimodal signals 
using sequencing methods poses many challenges 
(Partan, 2013). For instance, gene expression may be determined 
by behavior as well as by environmental stimulation. Research 
on gene expression changes therefore needs to be conducted under 
strictly controlled internal and external conditions. In this study, 
female reproductive states and all experimental conditions 
were consistently controlled. Moreover, the playback and 
dark treatment times were designed according to several 
gene expression studies on anurans. Thus, our results not 
only demonstrate that the most widespread transcriptome 
technology (i.e., RNA-seq) can be a powerful tool for measuring 
brain gene expression in response to complex stimuli, but may 
also inform the experimental design of future research on 
multimodal communication. 

A previous study on birdsong indicates that after song 
exposure, down-regulated genes outnumber up-regulated 
ones in the brain (Dong et al., 2009). Interestingly, the same 
result was obtained when little torrent frogs were exposed 
to visual, acoustic, or audiovisual stimuli as compared with a 
blank contrast stimulus, as well as when frogs were exposed 
to an audiovisual versus a visual or acoustic stimulus (Table 
2). Thus, these findings suggest a role for gene expression 
suppression in the brain. Because it occurs in both birds and 
frogs and in the responses to different stimuli, such a mechanism 
may be highly conserved. At present, however, we 
know little about the mechanism of such gene suppression in 
the brain (Dong et al., 2009). It is possible that the suppression of 
gene expression is a homeostatic response evoked by an increase 
in signaling activity (Chew et al., 1995; Stripling et al., 1997). 
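
Whether down-regulated genes outnumber up-regulated ones by more than chance can be checked roughly with an exact sign test. The sketch below is an illustration only: the function name `sign_test_p` and the counts are hypothetical and are not taken from Table 2, and the test treats genes as independent, which co-regulated genes are not.

```python
from math import comb


def sign_test_p(n_down, n_up):
    """Two-sided exact binomial p-value for H0: down and up are equally likely."""
    n = n_down + n_up
    k = max(n_down, n_up)
    # One-tailed probability of a split at least as lopsided as the observed one.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Under p = 0.5 the distribution is symmetric, so double the tail (capped at 1).
    return min(1.0, 2 * tail)


# Hypothetical counts of (down-regulated, up-regulated) DEGs per comparison.
counts = {
    "visual vs. blank": (8, 2),
    "acoustic vs. blank": (5, 5),
}
for comparison, (n_down, n_up) in counts.items():
    print(comparison, sign_test_p(n_down, n_up))
```

A small p-value would indicate that the excess of down-regulated genes is unlikely under an even split; it says nothing about the mechanism of suppression.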

Brain energy metabolism has a close relationship with 
animal behavioral phenotypes (Rittschof and Schirmeier, 
2018). However, owing to the complexity and challenges of 
exploring brain function, we have limited knowledge of how energy 
metabolism is linked to the neural mechanisms that 
ultimately give rise to these behavioral phenotypes (Raichle, 
2015). In the little torrent frog, we found that brain 
energy consumption was linked to differential stimulus 
exposure. The functional classes of the up-regulated genes 
we identified showed that GO terms associated with energy 
metabolism were mostly enriched in the brain when females 
were presented with visual, acoustic, or audiovisual sexual 
stimuli (Table S2). Moreover, the analysis based on the top 10 
DEGs showed a similar result; that is, genes associated with 
energy availability were all found to be up-regulated when 
females processed these stimuli (Table 3). These results are 
consistent with the idea that the female brain expends 
energy to process different types of sexual signals. Further 
support for this idea is provided by the KEGG annotation, 
in which the most significantly enriched cellular metabolic 
pathways were cardiac muscle contraction and oxidative 
phosphorylation. In male zebra finches (Taeniopygia guttata), 
the majority of nuclear genes associated with mitochondrial 
energetics change significantly in expression during song response 
habituation (Dong et al., 2009). In the little torrent frog, we 
suggest that female preferences for acoustic or visual cues may 
be accompanied by rapid changes in energy metabolism. 
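
Enrichment of a GO term among DEGs is commonly assessed with a one-sided hypergeometric test. The sketch below illustrates that calculation under stated assumptions: the function name `enrichment_p` and all gene counts are hypothetical, and the test actually used in this study's annotation pipeline is not specified here.

```python
from math import comb


def enrichment_p(n_background, n_term, n_degs, n_overlap):
    """P(X >= n_overlap) when n_degs genes are drawn without replacement from
    n_background genes, of which n_term are annotated with the GO term."""
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    hits = sum(
        comb(n_term, x) * comb(n_background - n_term, n_degs - x)
        for x in range(n_overlap, upper + 1)
    )
    return hits / total


# Hypothetical example: 10 background genes, 4 annotated with an energy
# metabolism term, 3 DEGs, 2 of which carry the term.
print(enrichment_p(n_background=10, n_term=4, n_degs=3, n_overlap=2))
```

In this toy case the probability is 1/3, so an overlap of two genes would not count as enrichment; real analyses also correct for testing many terms at once.
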

Several neuronal activity-dependent molecular mechanisms 
have been proposed by which external stimuli trigger a 
neurogenomic shift (Wolf and Linden, 2012; Cardoso et al., 2015). 
One possible mechanism depends on the activation, such as by 
phosphorylation, of pre-existing proteins that subsequently 
regulate IEGs or the expression of other response genes, or 
act on the MAPK or other intracellular signaling pathways. 
IEGs are a set of activity-dependent genes that respond 
rapidly to various stimuli and have been commonly used to 
explore neuronal activity in the vertebrate brain (Terleph and 
Tremere, 2006). Many researchers use immunocytochemistry 
or in situ hybridization procedures to explore sensory- 
driven IEG expression in the brains of songbirds and frogs 
evoked by acoustic or visual stimuli. In zebra finches, visual 
information (i.e., colored lights) can influence gene responses 
to song stimulation (Bailey et al., 2002; Kruse et al., 2004), while 
pairing visual cues with song stimulation does not increase 
egr-1 expression in higher-order auditory telencephalic regions 
including the caudal medial mesopallium (CMM) and caudal 
medial nidopallium (NCM) (Avey et al., 2005). That study was 
a good starting point for research on gene function in multimodal 
communication behavior. More research is needed to examine 
whether the neurons in brain areas involved in processing 
audiovisual multimodal signals increase the expression of IEGs. 


5. Conclusions 
In sum, visual and auditory cues conveyed some of the same 
information related to mate choice and, in combination, 
increased the sexual attractiveness of one another in little 
torrent frogs. Sequencing data of whole brain tissue showed 
different neural genomic responses in females exposed to 
different communication behaviors, suggesting that the brain 
transcriptome can be used to track audiovisual behavioral 
preferences, as has been demonstrated for behavioral plasticity 
in some social insects. Based on these results, we analyzed energy 
metabolism, which has been reported to regulate acoustic 
and visual communication in other animal species. GO and 
KEGG annotation revealed a significant energy metabolism 
response when females were exposed to visual, acoustic, or 
audiovisual stimuli as compared with a blank contrast stimulus, 
but not when comparing an audiovisual versus a visual or 
acoustic stimulus. These findings suggest that behavioral and 
neurogenomic responses to acoustic and visual sexual cues are 
correlated in anurans. Brain activities such as energy use are 
often temporally and spatially dynamic. Future studies on 
these dynamic processes would provide further insights into 
multimodal sensory mechanisms. 

Appendix 


Figure S1 Hierarchical cluster analysis on DEGs shows that the transcriptional profiles of samples from the same treatment group were 
more similar than those from different treatment groups. 


Table S1 Summary of A. torrentis brain transcriptomes in all treatments. 

Samples  Raw reads  Clean reads  Q20 (%)  GC (%)
V2       53260782   50359892     97.06    44.62
?        58334344   55177702     96.95    44.52
?        57982260   54955548     96.97    44.69
?        51027254   48361148     96.85    44.83
?        49154492   46535400     97.12    44.66
?        51160540   48640190     96.9     44.43
D2       54492808   51859038     96.78    44.53

All samples: total number of transcripts, 926957; total number of unigenes, 474931; N50 length of unigenes, 1000. 

Note: AV, audiovisual; B, blank; A, auditory; V, visual. Results of transcripts and unigenes were obtained from all samples. ?, sample label not legible in the source. 


Table S2 GO annotations of differentially expressed genes in different comparisons. 


The table is available at the website https://github.com/woxinfei/2020/blob/woxinfei-patch-1/Zhao%20et%20al_table%20S2.xlsx 


